Online implicit agent modelling

Authors

  • Nolan Bard
  • Michael Johanson
  • Neil Burch
  • Michael H. Bowling
Abstract

The traditional view of agent modelling is to infer the explicit parameters of another agent’s strategy (i.e., their probability of taking each action in each situation). Unfortunately, in complex domains with high-dimensional strategy spaces, modelling every parameter often requires a prohibitive number of observations. Furthermore, even given a model of such a strategy, computing a response strategy that is robust to modelling error may be impractical online. Instead, we propose an implicit modelling framework where agents aim to estimate the utility of a fixed portfolio of pre-computed strategies. Using the domain of heads-up limit Texas hold’em poker, this work describes an end-to-end approach for building an implicit modelling agent. We compute robust response strategies, show how to select strategies for the portfolio, and apply existing variance reduction and online learning techniques to dynamically adapt the agent’s strategy to its opponent. We validate the approach by showing that our implicit modelling agent would have won the heads-up limit opponent exploitation event in the 2011 Annual Computer Poker Competition.
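The idea of estimating the utility of a fixed portfolio online, rather than modelling the opponent's strategy parameter by parameter, can be illustrated with a small bandit sketch. The abstract does not name the online learning algorithm or the variance reduction technique used, so the following is only an assumption-laden illustration: it uses the standard Exp3 algorithm, treats each portfolio strategy as an arm, and assumes payoffs have been rescaled to [0, 1]. The class name `Exp3PortfolioSelector`, the helper `simulate`, and the mean payoffs passed to it are hypothetical and not taken from the paper.

```python
import math
import random


class Exp3PortfolioSelector:
    """Exp3 bandit over a fixed portfolio of pre-computed strategies.

    Illustrative only: the choice of Exp3 is an assumption, not the
    method described in the abstract.
    """

    def __init__(self, num_strategies, gamma=0.1, rng=None):
        self.k = num_strategies
        self.gamma = gamma  # exploration rate in (0, 1]
        self.weights = [1.0] * num_strategies
        self.rng = rng if rng is not None else random.Random()

    def probabilities(self):
        # Mix the weight-proportional distribution with uniform exploration.
        total = sum(self.weights)
        return [
            (1.0 - self.gamma) * w / total + self.gamma / self.k
            for w in self.weights
        ]

    def choose(self):
        # Sample the index of the portfolio strategy to play next.
        probs = self.probabilities()
        return self.rng.choices(range(self.k), weights=probs)[0]

    def update(self, chosen, reward):
        # `reward` is assumed to lie in [0, 1].
        probs = self.probabilities()
        # Importance-weighted estimate keeps the utility estimate unbiased.
        estimate = reward / probs[chosen]
        self.weights[chosen] *= math.exp(self.gamma * estimate / self.k)
        # Rescale by the largest weight to avoid overflow; the resulting
        # probabilities are unchanged by this normalisation.
        largest = max(self.weights)
        self.weights = [w / largest for w in self.weights]


def simulate(mean_payoffs, rounds=5000, seed=0):
    """Play against a hypothetical opponent with fixed mean payoffs per strategy."""
    rng = random.Random(seed)
    selector = Exp3PortfolioSelector(len(mean_payoffs), gamma=0.1, rng=rng)
    for _ in range(rounds):
        i = selector.choose()
        # Bernoulli payoff stands in for the (rescaled) outcome of one hand.
        reward = 1.0 if rng.random() < mean_payoffs[i] else 0.0
        selector.update(i, reward)
    return selector
```

Calling `simulate([0.1, 0.2, 0.9])` returns a selector whose `probabilities()` should concentrate on the strategy with the highest mean payoff. The point of the sketch is that only one utility estimate per portfolio strategy is learned, which needs far fewer observations than a full explicit opponent model.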


Similar resources

A Robust Feedforward Active Noise Control System with a Variable Step-Size FxLMS Algorithm: Designing a New Online Secondary Path Modelling Method

Several approaches have been introduced in the literature for active noise control (ANC) systems. Since the Filtered-x Least Mean Square (FxLMS) algorithm appears to be the best choice as a controller filter, researchers tend to improve the performance of ANC systems by enhancing and modifying this algorithm. This paper proposes a new version of the FxLMS algorithm. In many ANC applications an online secondary pat...


Implicit iteration approximation for a finite family of asymptotically quasi-pseudocontractive type mappings

In this paper, strong convergence theorems of Ishikawa type implicit iteration process with errors for a finite family of asymptotically nonexpansive in the intermediate sense and asymptotically quasi-pseudocontractive type mappings in normed linear spaces are established by using a new analytical method, which essentially improve and extend some recent results obtained by Yang ...


Modelling Implicit Communication in Multi-Agent Systems with Hybrid Input/Output Automata

We propose an extension of Hybrid I/O Automata (HIOAs) to model agent systems and their implicit communication through perturbation of the environment, like localization of objects or radio signals diffusion and detection. To this end we decided to specialize some variables of the HIOAs whose values are functions both of time and space. We call them world variables. Basically they are treated s...


Engagement and cooperation in motivated agent modelling (Luck, Michael and d'Inverno, Mark; Goldsmiths Research Online, book section)

The title of this paper suggests two distinct aspects of the models that we propose and consider. The first of these is the modelling of other agents by motivated agents. That is to say that the act of modelling is itself motivated and constrained by the agent doing that modelling. The second aspect is that all such models will also be of motivated agents. It is not sufficient merely to know what o...


Optimal adaptive leader-follower consensus of linear multi-agent systems: Known and unknown dynamics

In this paper, the optimal adaptive leader-follower consensus of linear continuous-time multi-agent systems is considered. The error dynamics of each player depend on its neighbors’ information. Detailed analysis of online optimal leader-follower consensus under known and unknown dynamics is presented. The introduced reinforcement learning-based algorithms learn online the approximate solution...




Publication date: 2013